
Data overview
The analysis incorporated 624 stool metagenomes obtained from 11 independent external datasets (see Table 1). Patients were stratified by immunotherapy response into two groups: responders (R group, n = 362; 58.2%) and non-responders (NR group, n = 262; 41.8%). Response assessment followed RECIST 1.1 criteria, with the R group including patients showing complete response (CR), partial response (PR), or stable disease (SD) at 6-month follow-up, while the NR group comprised exclusively progressive disease (PD) cases. The study cohort received various immunotherapy regimens, including anti-PD1, anti-CTLA4, or combination therapies. Cancer type distribution revealed melanoma predominance (n = 456; 72.7%) [Frankel et al. 2017; Gopalakrishnan et al. 2018; Matson et al. 2018; Spencer et al. 2021; Lee et al. 2022; McCulloch et al. 2022; Tsakmaklis et al. 2023], followed by gastrointestinal (GI) cancers (n = 82; 13.1%) [Peng et al. 2020; Heshiki et al. 2020; Gunjur et al. 2024], non-small cell lung cancer (n = 15; 2.4%) [Liu et al. 2022; Heshiki et al. 2020], breast cancer (n = 4; 0.6%) [Heshiki et al. 2020], ovarian cancer (n = 2; 0.3%) [Heshiki et al. 2020], and other malignancies (n = 68; 10.8%) [Gunjur et al. 2024]. The datasets are plotted on a world map as shown in Figure 1.
All samples were collected prior to treatment initiation to evaluate baseline microbiota status. Following quality filtering with fastp, aggregated quality metrics were compiled using MultiQC to summarize the FastQC reports (MultiQC are attached to the report below). Before filtering, approximately ~15 bn reads (MEAN ± SD: 24 ± 15 mln reads per sample) were retained for downstream analysis, with an average of 34 ± 18% of reads per sample removed during quality control (see Table 2 for detailed filtering statistics). A total of 3649 OGUs belonging to 12 bacterial phyla were detected in 624 stool metagenomes. The most numerous of these were Firmicutes (n = 2443), Actinobacteriota (n = 590), Bacteroidota (n = 394), Proteobacteria (n = 140), Desulfobacterota (n = 27) and others (n = 55). OGU abundance profiles of stool samples presented in Table 3.
Marker OGUs discovery
Analysis of marker OGUs identified 637 significantly differentiated features between R and NR experimental groups. Of these, 298 were enriched in R, while 339 were enriched in NR. The most prominent microbial species enriched in R included Blautia wexlerae (n = 49), Faecalibacterium prausnitzii (n = 27), Fusicatenibacter saccharivorans (n = 15), Gemmiger qucibialis (n = 14), and Blautia faecis (n = 8). Conversely, NR exhibited enrichment of distinct microbial taxa, including F. prausnitzii (n = 17), Veillonella parvula (n = 7), and Dialister invisus (n = 7). Although F. prausnitzii was detected in both groups, its significantly higher abundance in responders (27 vs. 17 OGUs) suggests potential strain-level or functional variations contributing to treatment outcome. The robust association of Blautia species, particularly B. wexlerae, with positive response highlights its potential as a predictive biomarker. Notably, the enrichment of oral bacteria Veillonella parvula and Dialister invisus in NR suggests a potential link between oral microbiome and immunotherapy efficacy. Detailed descriptions of marker OGUs are provided in Table 4. Songbird coefficients for each OGU marker are presented in Figure 2.
Log ratio values from the metagenomic samples are detailed in Table 5, with their distribution visualized across experimental groups in Figure 3. Statistical analysis utilizing linear mixed-effects models, accounting for dataset and response variables, revealed significant differences between R and NR groups (p < 0.0001). The inclusion of the dataset variable as a random effect to account for technical and biological heterogeneity between studies was statistically significant (p < 0.0001), justifying the employed modeling approach. Residual plot demonstrated an approximately symmetric, unimodal pattern with no extreme outliers (see Figure 4). This visual assessment, combined with the substantial effect size (Cohen’s d = 0.91; 95% CI [0.74, 1.08]), supports the robustness of the model’s inferences to mild non-normality, aligning with the central limit theorem’s applicability to large-sample analyses. A predictive classifier, based on log ratios and logistic regression with SMOTE balancing and LOGO cross validation with dataset as grouping variable, demonstrated a receiver operating characteristic (ROC) area under the curve (AUC) of 0.72 ± 0.12 (Figure 5), with other performance metrics detailed in Table 6 and Figure 6. The enhanced predictive accuracy of our model for identifying NR suggests that therapeutic success likely depends on a multifactorial interplay extending beyond the gut microbiome, potentially involving intrinsic host factors such as tumor biology (e.g., mutational burden, immune microenvironment) or patient-specific characteristics (e.g., genetic polymorphisms, metabolic status).
Saving 8 x 3 in image

Saving 4 x 3 in image

lmer test results
Simultaneous Tests for General Linear Hypotheses
Multiple Comparisons of Means: Tukey Contrasts
Fit: lmer(formula = log_ratio ~ response + (1 | dataset), data = df.log_ratio)
Linear Hypotheses:
Estimate Std. Error z value Pr(>|z|)
R - NR == 0 1.7005 0.1508 11.27 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- bonferroni method)
Dataset variable inportance
ANOVA-like table for random-effects: Single term deletions
Model:
log_ratio ~ response + (1 | dataset)
npar logLik AIC LRT Df Pr(>Chisq)
<none> 4 -1274.6 2557.1
(1 | dataset) 3 -1283.1 2572.3 17.119 1 3.511e-05 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residuals of the model normality testing

Effect size estimation
Cohen's d | 95% CI
--------------------------
-0.91 | [-1.08, -0.74]
- Estimated using pooled SD.
Saving 4 x 4 in image

Saving 4.5 x 3.5 in image

Marker OGUs characterization
As a summarizing tool gene set enrichment analysis (GSEA) was used which identified patterns of change related to microbial taxonomy (phylum, genus) and origin, distinguishing between microbes from non-intestinal body sites (oral cavity, airways, vagina, skin) and environment (food).
Taxonomy
OGUs with 98% nucleotide identity were grouped by taxonomy or other characteristics to facilitate enrichment analysis within specific groups. Results indicated that the Bacteroidetes phylum was upregulated in the R group and Proteobacteria in the NR group (abs. NES > 2, GSEA adj. p < 0.01). Obtained results presented in Figure 7 and Table 7. Bacteroidetes OGU included in core enrichment species presented in Table 8. Notably, enrichment of Proteobacteria OGUs included potential opportunistic species such as Klebsiella quasipneumoniae, Klebsiella michiganensis, Enterobacter ludwigii, Enterobacter kobei, Citrobacter youngae, Citrobacter portucalensis, Citrobacter freundii, and Haemophilus sp001815355 (see Table 9). At the genus level, Blautia, Bacteroides, Fusicatenibacter, Gemmiger, and Faecalibacterium were enriched in the R group, while Veillonella, Prevotella, Dysosmobacter, and Acetatifactor were enriched in the NR group (see Figure 8 and Table 10; GSEA adj. p < 0.01).
Origin
Clustering 3,816 OGUs from this study with 9,412 MAGs derived from different body sites and 10,112 food-derived MAGs revealed that 293 OGUs exhibited complementary to those from non-intestinal sources (see Table 11), and 91 were complementary to food-derived microbes, at 95% nucleotide similarity (Table 12), which corresponds to the species level. Based on the GSEA results, NR-linked OGUs were enriched by food-derived (see Figure 9 and Table 13) and oral (see Figure 10 and Table 14) microbial flora. Core food-derived OGUs comprised Bifidobacterium animalis, Citrobacter freundii, Enterobacter kobei, Enterococcus faecalis, Enterococcus faecium, Escherichia coli, K. michiganensis, K. quasipneumoniae, Lactobacillus gasseri, Limosilactobacillus reuteri, and Megasphaera elsdenii, showing substantial overlap with core enrichment patterns observed in Proteobacteria (Table 15). Core enrichment patterns within the oral cavity included OGU representatives of Veillonellales and Enterobacterales, specifically Anaeroglobus micronuciformis, Escherichia coli, Veillonella parvula, and Haemophilus sp001815355 (Table 16).
Saving 6 x 5.5 in image

Saving 6 x 5.5 in image

Saving 4.5 x 3.5 in image

Saving 4.5 x 3.5 in image

Functional analysis of marker OGU
Functional analysis of marker OGUs was performed using the MetaCerberus pipeline, CAZy and KEGG databases. Logistic regression was used to examine statistical relationships between functional groups and R/NR OGU sets. GSEA was then employed to identify links between CAZy/KEGG functional gene groups and R/NR OGU sets, utilizing p-values generated by the logistic regression.
CAZy
Glycoside hydrolase (GH) functional groups were linked to the R OGU set (NES = 1.4, adj. p = 0.003). Obtained results presented in Figure 11 and Table 17. The core enrichment set included 30 glycoside hydrolase families, such as GH39 (alpha-L-iduronidase), GH42 (beta-galactosidase), GH43 (beta-xylosidase), GH51 (endoglucanase), and GH73 (lysozyme) (see Table 18). The most common microbiota genera associated with these enriched GH families were Blautia (31 genomes), Fusicatenibacter (n = 11), Bacteroides (n=9), and Parabacteroides (n=5) (see Figure 12).
KEGG
Analysis of enriched KEGG pathways revealed a significant association with the R OGU and NR OGU groups (abs. NES > 1, adj. p < 0.01). Obtained results presented in Figure 13 and Table 19. Specifically, the R OGU group exhibited enrichment for eight KEGG pathways, including: ko00340 (Histidine metabolism), ko01210 (2-Oxocarboxylic acid metabolism), ko00290 (Valine, leucine and isoleucine biosynthesis), ko00220 (Arginine biosynthesis), ko01230 (Biosynthesis of amino acids), ko00920 (Sulfur metabolism), ko00040 (Pentose and glucuronate interconversions), and ko01240 (Biosynthesis of cofactors). The NR OGU group, in contrast, showed enrichment for two KEGG pathways: ko00540 (Lipopolysaccharide biosynthesis) and ko02040 (Flagellar assembly). Further analysis revealed distinct bacterial genera associated with these functional pathways (see Figure 14). Blautia and Fusicatenibacter were primarily linked to the pathways enriched in the R OGU marker group, while Veillonella, Dialister, and Enterobacter were associated with ko00540 (Lipopolysaccharide biosynthesis) in the NR group. Similarly, Acetifactor, Roseburia, and Ruminoclostridium were linked to ko02040 (Flagellar assembly) enrichment in the NR group. These findings suggest potential mechanistic links between specific microbial taxa and the observed functional differences between the R OGU and NR OGU groups.
Our previous metagenomic analysis revealed significant enrichment of Wood-Ljungdahl (WL) pathway-encoding MAGs from Blautia, Oliverpabstia, Fusicatenibacter, and Choladousia genera in immunotherapy-responsive melanoma patients [Zakharevich et al., 2024]. In the current study, we identified eight core WL pathway gene clusters (fchA - K01500; cooS, acsA - K00198; acsE - K15023; acsB - K14138; cdhE, acsC - K00197; cdhD, acsD - K00194; rnfC2 - K25008; metV - K25007) that were consistently overrepresented in operational genomic units (OGUs) associated with positive treatment outcomes. Notably, these WL pathway genes showed strong associations with Blautia and Fusicatenibacter MAG sequences (see Figure 15). Functional divergence emerged between R- and NR-associated F. prausnitzii strains (see Table S20). NR-linked strains exhibited marked enrichment in galactose metabolism (ko00052), driven by core genes including galT (K00965), gatY/kbaY (K08302), agaS (K02082), and gatZ/kbaZ (K16371). These strains also overrepresented the N-acetyl-D-galactosamine utilization system (agaF - K02744; agaV - K02745, agaW - K02746, agaE - K02747).
Saving 4.5 x 3.5 in image

Saving 6 x 4 in image

Saving 7.5 x 5 in image

Saving 12.5 x 18.5 in image

Saving 6 x 4 in image

References:
- Frankel, Arthur E., et al. “Metagenomic shotgun sequencing and unbiased metabolomic profiling identify specific human gut microbiota and metabolites associated with immune checkpoint therapy efficacy in melanoma patients.” Neoplasia 19.10 (2017): 848-855.
- Gopalakrishnan, Vancheswaran, et al. “Gut microbiome modulates response to anti–PD-1 immunotherapy in melanoma patients.” Science 359.6371 (2018): 97-103.
- Matson, Vyara, et al. “The commensal microbiome is associated with anti–PD-1 efficacy in metastatic melanoma patients.” Science 359.6371 (2018): 104-108.
- Spencer, Christine N., et al. “Dietary fiber and probiotics influence the gut microbiome and melanoma immunotherapy response.” Science 374.6575 (2021): 1632-1640.
- Lee, Karla A., et al. “Cross-cohort gut microbiome associations with immune checkpoint inhibitor response in advanced melanoma.” Nature Medicine 28.3 (2022): 535-544.
- McCulloch, John A., et al. “Intestinal microbiota signatures of clinical response and immune-related adverse events in melanoma patients treated with anti-PD-1.” Nature Medicine 28.3 (2022): 545-556.
- Tsakmaklis, Anastasia, et al. “TIGIT+ NK cells in combination with specific gut microbiota features predict response to checkpoint inhibitor therapy in melanoma patients.” BMC Cancer 23.1 (2023): 1160.
- Liu, Ben, et al. “Exploring gut microbiome in predicting the efficacy of immunotherapy in non-small cell lung cancer.” Cancers 14.21 (2022): 5401.
- Heshiki, Yoshitaro, et al. “Predictable modulation of cancer treatment outcomes by the gut microbiota.” Microbiome 8 (2020): 1-14.
- Gunjur, Ashray, et al. “A gut microbial signature for combination immune checkpoint blockade across cancer types.” Nature Medicine 30.3 (2024): 797-809.
- Peng, Zhi, et al. “The gut microbiome is associated with clinical response to anti–PD-1/PD-L1 immunotherapy in gastrointestinal cancer.” Cancer Immunology Research 8.10 (2020): 1251-1261.
- Zakharevich, Natalia V., et al. “Systemic metabolic depletion of gut microbiome undermines responsiveness to melanoma immunotherapy.” Life Science Alliance 7.5 (2024).